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Abstract

   A micro-loop is a packet-forwarding loop that may occur transiently
   among two or more routers in a hop-by-hop packet-forwarding paradigm.

   This document analyzes the impact of using different link state IGP
   implementations in a single network with respect to micro-loops.  The
   analysis is focused on the Shortest Path First (SPF) delay algorithm
   but also mentions the impact of SPF trigger strategies.

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Not all documents
   approved by the IESG are candidates for any level of Internet
   Standard; see Section 2 of RFC 7841.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   https://www.rfc-editor.org/info/rfc8541.
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1.  Introduction

   Link state IGP protocols are based on a topology database on which
   the SPF algorithm is run to find a consistent set of non-looping
   routing paths.

   Specifications like IS-IS [RFC1195] propose some optimizations of the
   route computation (see Appendix C.1 of [RFC1195]), but not all
   implementations follow those non-mandatory optimizations.

   In this document, we refer to the events that lead to a new SPF
   computation based on the topology as "SPF triggers".

   Link state IGP protocols, like OSPF [RFC2328] and IS-IS [RFC1195],
   use multiple timers to control the router behavior in case of churn:
   SPF delay, Partial Route Computation (PRC) delay, Link State Packet
   (LSP) generation delay, LSP flooding delay, and LSP retransmission
   interval.

   Some of the values and behaviors of these timers are standardized in
   protocol specifications, and some are not.  The SPF computation-
   related timers have generally remained unspecified.

   Implementations are free to implement non-standardized timers in any
   way.  For some standardized timers, implementations may offer
   dynamically adjusted timers to help control the churn rather than use
   static configurable values.

   "SPF delay" refers to the timer in most implementations that
   specifies the required delay before running an SPF computation after
   an SPF trigger is received.

   A micro-loop is a packet-forwarding loop that may occur transiently
   among two or more routers in a hop-by-hop packet-forwarding paradigm.
   These micro-loops are formed when two routers do not update their
   Forwarding Information Bases (FIBs) for a certain prefix at the same
   time.  The micro-loop phenomenon is described in [MICROLOOP-LSRP].

   Two micro-loop mitigation techniques have been defined by IETF.  The
   mechanism in [RFC6976] has not been widely implemented, presumably
   due to the complexity of the technique.  The mechanism in [RFC8333]
   has been implemented.  However, it does not prevent all micro-loops
   that can occur for a given topology and failure scenario.








Litkowski, et al.             Informational                     [Page 3]

RFC 8541              SPF Impact on IGP Micro-loops           March 2019


   In multi-vendor networks, using different implementations of a link
   state protocol may favor micro-loop creation during the convergence
   process due to discrepancies in timers.  Service providers already
   know to use timers with similar values and behaviors for all of the
   network as a best practice, but this is sometimes not possible due to
   the limitations of implementations.

   This document presents reasons for service providers to have
   consistent implementation of link state protocols across vendors.  In
   particular, this document analyzes the impact of using different link
   state IGP implementations in a single network with regard to micro-
   loops.  The analysis focuses on the SPF delay algorithm.

   [RFC8405] defines a solution that partially addresses this problem
   statement, and this document captures the reasoning of the provided
   solution.

2.  Problem Statement

                              S ---- E
                              |      |
                           10 |      | 10
                              |      |
                              D ---- A
                              |  2
                              Px

            Figure 1: Network Topology Experiencing Micro-loops

   Figure 1 represents a small network composed of four routers (S, D,
   E, and A).  Router S primarily uses the SD link to reach the prefixes
   behind router D (named Px).  When the SD link fails, the IGP
   convergence occurs.  If S converges before E, S will forward the
   traffic to Px through E; however, because E has not converged yet, E
   will loop the traffic back to S, leading to a micro-loop.

   The micro-loop appears due to the asynchronous convergence of nodes
   in a network when an event occurs.

   Multiple factors (or a combination of factors) may increase the
   probability of a micro-loop appearing:

   o  Delay of failure notification: The greater the time gap between E
      and S being advised of the failure, the greater the chance that a
      micro-loop may appear.
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   o  SPF delay: Most implementations support a delay for the SPF
      computation to catch as many events as possible.  If S uses an SPF
      delay timer of x ms, E uses an SPF delay timer of y ms, and x < y,
      E would start converging after S, leading to a potential micro-
      loop.

   o  SPF computation time: This is mostly a matter of CPU power and
      optimizations like incremental SPF.  If S computes its SPF faster
      than E, there is a chance for a micro-loop to appear.  Today, CPUs
      are fast enough to consider the SPF computation time as negligible
      (on the order of milliseconds in a large network).

   o  SPF computation ordering: An SPF trigger can be common to multiple
      IGP areas or levels (e.g., IS-IS Level 1 and Level 2) or to
      multiple address families with multi-topologies.  There is no
      specified order for SPF computation today, and it is
      implementation dependent.  In such scenarios, if the order of SPF
      computation done in S and E for each area, level, topology, or SPF
      algorithm is different, there is a possibility for a micro-loop to
      appear.

   o  RIB and FIB prefix insertion speed or ordering: This is highly
      dependent on the implementation.

   Even if all of these factors increase the probability of a micro-loop
   appearing, the SPF delay plays a significant role, especially in case
   of churn.  As the number of IGP events increases, the delta between
   the SPF delay values used by routers becomes significant; in fact, it
   becomes the dominating factor (especially when one router increases
   its timer exponentially while another one increases it in a smoother
   way).  Another important factor is the time to update the FIB.  As of
   today, the total FIB update time is the major factor for IGP
   convergence.  However, for micro-loops, what matters is not the total
   time but the difference in installing the same prefix between nodes.
   The time to update the FIB may be the main part for the first
   iteration but not for subsequent IGP events.  In addition, the time
   to update the FIB is very implementation specific and difficult or
   impossible to standardize, while the SPF delay algorithm may be
   standardized.

   As a consequence, this document will focus on an analysis of SPF
   delay behavior and associated triggers.
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3.  SPF Trigger Strategies

   Depending on the change advertised in the LSP or LSA (Link State
   Advertisement), the topology may or may not be affected.  An
   implementation may avoid running the SPF computation (and may only
   run an IP reachability computation instead) if the advertised change
   does not affect the topology.

   Different strategies can trigger the SPF computation:

   1.  An implementation may always run a full SPF for any type of
       change.

   2.  An implementation may run a full SPF only when required.  For
       example, if a link fails, a local node will run an SPF for its
       local LSP update.  If the LSP from the neighbor (describing the
       same failure) is received after SPF has started, the local node
       can decide that a new full SPF is not required as the topology
       has not changed.

   3.  If the topology does not change, an implementation may only
       recompute the IP reachability.

   As noted in Section 1, SPF optimizations are not mandatory in
   specifications.  This has led to the implementation of different
   strategies.

4.  SPF Delay Strategies

   Implementations of link state routing protocols use different
   strategies to delay SPF computation.  The two most common SPF delay
   behaviors are the following:

   1.  Two-step SPF delay

   2.  Exponential back-off delay

   These behaviors are explained in the following sections.
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4.1.  Two-Step SPF Delay

   The SPF delay is managed by four parameters:

   o  rapid delay: the amount of time to wait before running SPF after
      the initial SPF trigger event.

   o  rapid runs: the number of consecutive SPF runs that can use the
      rapid delay.  When the number is exceeded, the delay moves to the
      slow delay value.

   o  slow delay: the amount of time to wait before running an SPF.

   o  wait time: the amount of time to wait without detecting SPF
      trigger events before going back to the rapid delay.

   Figure 2 displays the evolution of the SPF delay timer (based on a
   two-step delay algorithm) upon the reception of multiple events.
   Figure 2 considers the following parameters for the algorithm: rapid
   delay (RD) = 50 ms, rapid runs = 3, slow delay (SD) = 1 s, wait time
   = 2 s.

   SPF delay time
       ^
       |
       |
   SD- |             x xx x
       |
       |
       |
   RD- |   x  x   x                    x
       |
       +---------------------------------> Events
           |  |   |  | || |            |
                           < wait time >

                  Figure 2: Two-Step SPF Delay Algorithm

4.2.  Exponential Back-Off Delay

   The algorithm has two modes: fast mode and back-off mode.  In fast
   mode, the SPF delay is usually delayed by a very small amount of time
   (fast reaction).  When an SPF computation is run in fast mode, the
   algorithm automatically moves to back-off mode (a single SPF run is
   authorized in fast mode).  In back-off mode, the SPF delay increases
   exponentially in each run.  When the network becomes stable, the
   algorithm moves back to fast mode.  The SPF delay is managed by four
   parameters:
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   o  first delay: amount of time to wait before running SPF.  This
      delay is used only when SPF is in fast mode.

   o  incremental delay: amount of time to wait before running SPF.
      This delay is used only when SPF is in back-off mode and
      increments exponentially at each SPF run.

   o  maximum delay: maximum amount of time to wait before running SPF.

   o  wait time: amount of time to wait without events before going back
      to fast mode.

   Figure 3 displays the evolution of the SPF delay timer (based on an
   exponential back-off delay algorithm) upon the reception of multiple
   events.  Figure 3 considers the following parameters for the
   algorithm: first delay (FD) = 50 ms, incremental delay (ID) = 50 ms,
   maximum delay (MD) = 1 s, wait time = 2 s

   SPF delay time
       ^
   MD- |               xx x
       |
       |
       |
       |
       |
       |             x
       |
       |
       |
       |          x
       |
   FD- |   x  x                        x
   ID  |
       +---------------------------------> Events
           |  |   |  | || |            |
                           < wait time >
          FM->BM -------------------->FM

              Figure 3: Exponential Back-Off Delay Algorithm
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5.  Mixing Strategies

   Figure 1 illustrates a flow of packets from S to D.  S uses optimized
   SPF triggering (full SPF is triggered only when necessary) and two-
   step SPF delay (rapid delay = 150 ms, rapid runs = 3, slow delay = 1
   s).  As the implementation of S is optimized, PRC is available.  For
   PRC delay, we consider the same timers as for SPF delay.  E uses an
   SPF trigger strategy that always computes a full SPF for any change
   and uses the exponential back-off strategy for SPF delay (first delay
   = 150 ms, incremental delay = 150 ms, maximum delay = 1 s).

   Consider the following sequence of events:

   o  t0=0 ms: A prefix is declared down in the network.  This event
      happens at time=0.

   o  200 ms: The prefix is declared up.

   o  400 ms: The prefix is declared down in the network.

   o  1000 ms: S-D link fails.

   +---------+-------------------+------------------+------------------+
   |   Time  |   Network Event   | Router S Events  | Router E Events  |
   +---------+-------------------+------------------+------------------+
   |   t0=0  |    Prefix DOWN    |                  |                  |
   |  10 ms  |                   | Schedule PRC (in | Schedule SPF (in |
   |         |                   |     150 ms)      |     150 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |  160 ms |                   |    PRC starts    |    SPF starts    |
   |  161 ms |                   |     PRC ends     |                  |
   |  162 ms |                   |  RIB/FIB starts  |                  |
   |  163 ms |                   |                  |     SPF ends     |
   |  164 ms |                   |                  |  RIB/FIB starts  |
   |  175 ms |                   |   RIB/FIB ends   |                  |
   |  178 ms |                   |                  |   RIB/FIB ends   |
   |         |                   |                  |                  |
   |  200 ms |     Prefix UP     |                  |                  |
   |  212 ms |                   | Schedule PRC (in |                  |
   |         |                   |     150 ms)      |                  |
   |  214 ms |                   |                  | Schedule SPF (in |
   |         |                   |                  |     150 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |  370 ms |                   |    PRC starts    |                  |
   |  372 ms |                   |     PRC ends     |                  |
   |  373 ms |                   |                  |    SPF starts    |
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   |  373 ms |                   |  RIB/FIB starts  |                  |
   |  375 ms |                   |                  |     SPF ends     |
   |  376 ms |                   |                  |  RIB/FIB starts  |
   |  383 ms |                   |   RIB/FIB ends   |                  |
   |  385 ms |                   |                  |   RIB/FIB ends   |
   |         |                   |                  |                  |
   |  400 ms |    Prefix DOWN    |                  |                  |
   |  410 ms |                   | Schedule PRC (in | Schedule SPF (in |
   |         |                   |     300 ms)      |     300 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |  710 ms |                   |    PRC starts    |    SPF starts    |
   |  711 ms |                   |     PRC ends     |                  |
   |  712 ms |                   |  RIB/FIB starts  |                  |
   |  713 ms |                   |                  |     SPF ends     |
   |  714 ms |                   |                  |  RIB/FIB starts  |
   |  716 ms |                   |   RIB/FIB ends   |   RIB/FIB ends   |
   |         |                   |                  |                  |
   | 1000 ms |   S-D link DOWN   |                  |                  |
   | 1010 ms |                   | Schedule SPF (in | Schedule SPF (in |
   |         |                   |     150 ms)      |     600 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   | 1160 ms |                   |    SPF starts    |                  |
   | 1161 ms |                   |     SPF ends     |                  |
   | 1162 ms |   Micro-loop may  |  RIB/FIB starts  |                  |
   |         |  start from here  |                  |                  |
   | 1175 ms |                   |   RIB/FIB ends   |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   | 1612 ms |                   |                  |    SPF starts    |
   | 1615 ms |                   |                  |     SPF ends     |
   | 1616 ms |                   |                  |  RIB/FIB starts  |
   | 1626 ms |  Micro-loop ends  |                  |   RIB/FIB ends   |
   +---------+-------------------+------------------+------------------+

    Table 1: Route Computation When S and E Use Different Behaviors and
                          Multiple Events Appear
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   In Table 1, due to discrepancies in the SPF management and after
   multiple events of different types, the values of the SPF delay are
   completely misaligned between node S and node E, leading to the
   creation of micro-loops.

   The same issue can also appear with only a single type of event as
   shown below:

   +---------+-------------------+------------------+------------------+
   |   Time  |   Network Event   | Router S Events  | Router E Events  |
   +---------+-------------------+------------------+------------------+
   |   t0=0  |     Link DOWN     |                  |                  |
   |  10 ms  |                   | Schedule SPF (in | Schedule SPF (in |
   |         |                   |     150 ms)      |     150 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |  160 ms |                   |    SPF starts    |    SPF starts    |
   |  161 ms |                   |     SPF ends     |                  |
   |  162 ms |                   |  RIB/FIB starts  |                  |
   |  163 ms |                   |                  |     SPF ends     |
   |  164 ms |                   |                  |  RIB/FIB starts  |
   |  175 ms |                   |   RIB/FIB ends   |                  |
   |  178 ms |                   |                  |   RIB/FIB ends   |
   |         |                   |                  |                  |
   |  200 ms |     Link DOWN     |                  |                  |
   |  212 ms |                   | Schedule SPF (in |                  |
   |         |                   |     150 ms)      |                  |
   |  214 ms |                   |                  | Schedule SPF (in |
   |         |                   |                  |     150 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |  370 ms |                   |    SPF starts    |                  |
   |  372 ms |                   |     SPF ends     |                  |
   |  373 ms |                   |                  |    SPF starts    |
   |  373 ms |                   |  RIB/FIB starts  |                  |
   |  375 ms |                   |                  |     SPF ends     |
   |  376 ms |                   |                  |  RIB/FIB starts  |
   |  383 ms |                   |   RIB/FIB ends   |                  |
   |  385 ms |                   |                  |   RIB/FIB ends   |
   |         |                   |                  |                  |
   |  400 ms |     Link DOWN     |                  |                  |
   |  410 ms |                   | Schedule SPF (in | Schedule SPF (in |
   |         |                   |     150 ms)      |     300 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |  560 ms |                   |    SPF starts    |                  |
   |  561 ms |                   |     SPF ends     |                  |
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   |  562 ms |   Micro-loop may  |  RIB/FIB starts  |                  |
   |         |  start from here  |                  |                  |
   |  568 ms |                   |   RIB/FIB ends   |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |  710 ms |                   |                  |    SPF starts    |
   |  713 ms |                   |                  |     SPF ends     |
   |  714 ms |                   |                  |  RIB/FIB starts  |
   |  716 ms |  Micro-loop ends  |                  |   RIB/FIB ends   |
   |         |                   |                  |                  |
   | 1000 ms |     Link DOWN     |                  |                  |
   | 1010 ms |                   | Schedule SPF (in | Schedule SPF (in |
   |         |                   |       1 s)       |     600 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   | 1612 ms |                   |                  |    SPF starts    |
   | 1615 ms |                   |                  |     SPF ends     |
   | 1616 ms |   Micro-loop may  |                  |  RIB/FIB starts  |
   |         |  start from here  |                  |                  |
   | 1626 ms |                   |                  |   RIB/FIB ends   |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   | 2012 ms |                   |    SPF starts    |                  |
   | 2014 ms |                   |     SPF ends     |                  |
   | 2015 ms |                   |  RIB/FIB starts  |                  |
   | 2025 ms |  Micro-loop ends  |   RIB/FIB ends   |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   +---------+-------------------+------------------+------------------+

   Table 2: Route Computation upon Multiple Link Down Events When S and
                         E Use Different Behaviors
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6.  Benefits of Standardized SPF Delay Behavior

   Table 3 uses the same event sequence as Table 1.  Fewer and/or
   shorter micro-loops are expected using a standardized SPF delay.

   +---------+-------------------+------------------+------------------+
   |   Time  |   Network Event   | Router S Events  | Router E Events  |
   +---------+-------------------+------------------+------------------+
   |   t0=0  |    Prefix DOWN    |                  |                  |
   |  10 ms  |                   | Schedule PRC (in | Schedule PRC (in |
   |         |                   |     150 ms)      |     150 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |  160 ms |                   |    PRC starts    |    PRC starts    |
   |  161 ms |                   |     PRC ends     |                  |
   |  162 ms |                   |  RIB/FIB starts  |     PRC ends     |
   |  163 ms |                   |                  |  RIB/FIB starts  |
   |  175 ms |                   |   RIB/FIB ends   |                  |
   |  176 ms |                   |                  |   RIB/FIB ends   |
   |         |                   |                  |                  |
   |  200 ms |     Prefix UP     |                  |                  |
   |  212 ms |                   | Schedule PRC (in |                  |
   |         |                   |     150 ms)      |                  |
   |  213 ms |                   |                  | Schedule PRC (in |
   |         |                   |                  |     150 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |  370 ms |                   |    PRC starts    |    PRC starts    |
   |  372 ms |                   |     PRC ends     |                  |
   |  373 ms |                   |  RIB/FIB starts  |     PRC ends     |
   |  374 ms |                   |                  |  RIB/FIB starts  |
   |  383 ms |                   |   RIB/FIB ends   |                  |
   |  384 ms |                   |                  |   RIB/FIB ends   |
   |         |                   |                  |                  |
   |  400 ms |    Prefix DOWN    |                  |                  |
   |  410 ms |                   | Schedule PRC (in | Schedule PRC (in |
   |         |                   |     300 ms)      |     300 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   |  710 ms |                   |    PRC starts    |    PRC starts    |
   |  711 ms |                   |     PRC ends     |     PRC ends     |
   |  712 ms |                   |  RIB/FIB starts  |                  |
   |  713 ms |                   |                  |  RIB/FIB starts  |
   |  716 ms |                   |   RIB/FIB ends   |   RIB/FIB ends   |
   |         |                   |                  |                  |
   | 1000 ms |   S-D link DOWN   |                  |                  |
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   | 1010 ms |                   | Schedule SPF (in | Schedule SPF (in |
   |         |                   |     150 ms)      |     150 ms)      |
   |         |                   |                  |                  |
   |         |                   |                  |                  |
   | 1160 ms |                   |    SPF starts    |                  |
   | 1161 ms |                   |     SPF ends     |    SPF starts    |
   | 1162 ms |   Micro-loop may  |  RIB/FIB starts  |     SPF ends     |
   |         |  start from here  |                  |                  |
   | 1163 ms |                   |                  |  RIB/FIB starts  |
   | 1175 ms |                   |   RIB/FIB ends   |                  |
   | 1177 ms |  Micro-loop ends  |                  |   RIB/FIB ends   |
   +---------+-------------------+------------------+------------------+

     Table 3: Route Computation When S and E Use the Same Standardized
                                 Behavior

   As displayed above, there can be other parameters, like router
   computation power and flooding timers, that may also influence micro-
   loops.  In all the examples in this document comparing the SPF timer
   behavior of router S and router E, we have made router E a bit slower
   than router S.  This can lead to micro-loops even when both S and E
   use a common standardized SPF behavior.  However, by aligning
   implementations of the SPF delay, we expect that service providers
   may reduce the number and duration of micro-loops.

7.  Security Considerations

   This document does not introduce any security considerations.

8.  IANA Considerations

   This document has no actions for IANA.
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